ABSTRACT


In Chapter I the problem of robust estimation of a location
parameter is introduced, and an outline of the results of J.W. Tukey,
who first investigated this problem, is given.

In Chapter II an important method of robust estimation
of location based on rank tests, due to J.L. Hodges Jr. and
E.L. Lehmann, is presented.

In Chapter III we present the minimax approach to the
theory of robust estimation of a location parameter, due to P.J. Huber.
An important class of estimates based on the minimal principle is
also introduced, and a unique most robust estimate for the location
parameter, according to an accepted measure of robustness, is defined.

In Chapter IV some new estimates of the location parameter
are presented, which in a sense lie between the sample mean and the
sample median, and which are based on the minimal principle. According
to an accepted measure of robustness, two most robust estimates for the
location parameter in two classes of estimates are defined, and some of
their properties are investigated.
CHAPTER I


ROBUST ESTIMATION OF LOCATION PARAMETER 


I.1 Introduction
The problem of the theory of point estimation is to develop
methods for finding estimates on the basis of sample values
$x_1, \dots, x_n$, i.e., on the basis of observed values of the random
variables $X_1, \dots, X_n$ whose distribution function is assumed to
be $F(x)$, which, according to certain criteria, are best for estimating
the unknown parameter $\theta$ of the assumed underlying distribution
function $F(x)$.

The assumption of normality of the distribution function 
F(x) is frequent in classical statistical methods. 

In reality we never have a complete knowledge about the 
true underlying distribution function, and hence the assumed and the 
true underlying distribution functions may differ. Therefore it is 
necessary to know how an estimate, obtained by a certain method will 
perform, if the assumed underlying distribution is slightly changed 
or replaced by a different one. 

It seems to be a desirable property of an estimate to have
stable performance, at least when we allow for a small change in the
assumed underlying distribution function. This was recognized in the
past, but until recent times not enough attention was paid to this
basic problem. The theory of robust estimation deals with this problem
and tries to exhibit estimates whose performance is relatively
stable not only under one set of circumstances, as is the case, for
example, when we assume only a fixed underlying distribution function
$F(x)$ in the classical problem of point estimation of a single
unknown parameter $\theta$.
The problem of robust estimation of a location parameter is to
develop methods for finding estimates, on the basis of observed values
$x_1, \dots, x_n$ of the random variables $X_1, \dots, X_n$ having
distribution function $F_\theta(x) = F(x-\theta)$, which according to
certain criteria are stable (robust) estimates of the unknown location
parameter $\theta$ of the distribution function
$F_\theta(x) = F(x-\theta)$, when the prototype distribution function
$F(x)$ is assumed to be known only approximately.
In the case of a symmetric prototype distribution function,
i.e., $F(x) + F(-x) = 1$, the center of symmetry of the distribution
$F(x-\theta)$ is considered as the unknown location parameter $\theta$,
a natural quantity to estimate in this situation.
Thus the problems of the robust point estimation are: 
(a) to choose an appropriate model of indeterminacy which will 
play the role of the prototype distribution function of 
the classical theory of point estimation; 
(b) to define a most robust estimate in a reasonable class 


of estimates. 


I.2 Outline of Tukey's Results 

The shortcomings of the classical methods of estimation, namely
the assumption of normality and consequently their vulnerability to
gross errors, were investigated by J.W. Tukey and the Statistical
Research Group in Princeton in the late forties.

A survey paper by Tukey [14] published in 1960 is considered
as a starting point in the theory of robust estimation. Tukey
considered a model of indeterminacy in which the prototype distribution
function is the so-called contaminated normal distribution of the
form:


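        $F(x) = (1-\varepsilon)\,\Phi(x) + \varepsilon\,\Phi(x/h)$,

where $\Phi$ is the standard normal distribution function,
$\varepsilon \in [0,1]$ is the fraction of contamination and $h$ is the
scale ratio of the two component normal distribution functions. In
his investigations a scale ratio $h = 3$ has been used throughout.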

The purpose of the contaminated normal distribution function 
was to replace the exact normal distribution function in the classical 
problem of estimation of the location and the scale parameters. 

Tukey has shown that the classical estimate of the location
parameter $\mu$ of the distribution function $F((x-\mu)/\sigma)$, the
arithmetic mean, and the classical estimate of the scale parameter
$\sigma$ of the same distribution, the standard deviation, both have
unsatisfactory aspects: their variances explode even for a very
small contamination fraction $\varepsilon$.

Tukey proposed estimates for location and scale which are
more robust than the classical estimates. The asymptotic efficiency
and the asymptotic effective variance have been used as criteria
of robustness of an estimate in location and scale problems
respectively.

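As an estimate of the location parameter $\mu$, the $\alpha$-trimmed
mean (the arithmetic mean of those observations which remain when the
$100\alpha\%$ lowest and $100\alpha\%$ highest have been set aside),
$\bar{X}_\alpha$,

        $\bar{X}_\alpha = \{n - 2[\alpha n]\}^{-1} \sum_{i=[\alpha n]+1}^{n-[\alpha n]} X_{(i)}$

was proposed, where $0 < \alpha < 1/2$ and $X_{(1)} \le \dots \le X_{(n)}$
denote the order statistics of the sample $X_1, \dots, X_n$ from the
distribution $F((x-\mu)/\sigma)$.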
As an estimate of the scale parameter $\sigma$, the mean deviation
was proposed for smaller samples and, for larger samples, the
$\alpha$-truncated standard deviation (the standard deviation of those
observations which remain when the $100\alpha\%$ lowest and
$100\alpha\%$ highest have been set aside), $s_\alpha$,

        $s_\alpha = \left\{ \{n - 2[\alpha n] - 1\}^{-1} \sum_{i=[\alpha n]+1}^{n-[\alpha n]} (X_{(i)} - \bar{X}_\alpha)^2 \right\}^{1/2}$

was proposed, where $0 < \alpha < 1/2$ and $X_{(1)} \le \dots \le X_{(n)}$
denote the order statistics of the sample $X_1, \dots, X_n$ from the
distribution $F((x-\mu)/\sigma)$.
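The trimming rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are ours, `[alpha*n]` is taken to be the integer part of `alpha*n`, and the truncated standard deviation is assumed to be centered at the trimmed mean with divisor n - 2[alpha*n] - 1.

```python
import math

def kept_observations(xs, alpha):
    """Order statistics that remain after dropping the [alpha*n] lowest and highest."""
    n = len(xs)
    g = int(math.floor(alpha * n))          # [alpha * n]
    return sorted(xs)[g:n - g]

def trimmed_mean(xs, alpha):
    """Arithmetic mean of the observations that survive the trimming."""
    kept = kept_observations(xs, alpha)
    return sum(kept) / len(kept)

def truncated_std(xs, alpha):
    """Standard deviation of the surviving observations (assumed divisor: count - 1)."""
    kept = kept_observations(xs, alpha)
    center = sum(kept) / len(kept)
    return math.sqrt(sum((x - center) ** 2 for x in kept) / (len(kept) - 1))

# Hypothetical sample: nine values near 2 and one gross error.
sample = [2.1, 1.9, 2.0, 2.2, 1.8, 2.05, 1.95, 2.15, 1.85, 50.0]
plain_mean = sum(sample) / len(sample)      # pulled far away by the gross error
robust_mean = trimmed_mean(sample, 0.1)     # stays near 2
```

In this hypothetical sample a single gross error moves the arithmetic mean far from the bulk of the data, while the trimmed mean with alpha = 0.1 is unaffected by it.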


Tukey showed that the $\alpha$-trimmed mean $\bar{X}_\alpha$ is
asymptotically normal if the underlying prototype distribution function
$F(x)$ is symmetric and has a density which is continuous and strictly
positive on $\{x : 0 < F(x) < 1\}$. Thus

        $\mathcal{L}[n^{1/2}\,\bar{X}_\alpha] \to N(0, \sigma^2(\alpha))$, as $n \to \infty$,

where

        $\sigma^2(\alpha) = (1-2\alpha)^{-2} \left[ \int_{x(\alpha)}^{x(1-\alpha)} t^2\,dF(t) + 2\alpha\,x(\alpha)^2 \right]$

and $x(\alpha)$ is the $\alpha$-quantile of $F(x)$.
CHAPTER II


EFFICIENCY-ROBUST ESTIMATES OF LOCATION 


II.1 Outline of Results of Hodges and Lehmann 


In this chapter we shall present an important method of
estimation due to Hodges and Lehmann [5] which leads to robust estimates
of location. Their approach is one of the first attempts to deal
with a weak point of some classical methods of the theory of estimation:
the assumption of normality and consequently their vulnerability
to gross errors. They realized that for the problems of point
estimation the methods successful in the corresponding testing problems
could be applied. The rank tests such as the two Wilcoxon tests or
the Kruskal-Wallis H-test have more robust power against gross
errors than the t- and F-tests, and their efficiency loss is quite
small even in the rare case in which the suspicion of the possibility
of gross errors is unfounded. The method of estimation of location or
shift parameters proposed by Hodges and Lehmann is based on rank
test statistics such as the Wilcoxon or normal scores statistics,
which are successful in providing robust power for the corresponding
testing problems.

In the following sections we shall summarize the results 


of Hodges and Lehmann. 


II.2 Point Estimates Based on Test Statistics

Let $X_1, \dots, X_m$ and $Y_1, \dots, Y_n$ be independent random
variables with distributions




(2.1)    $P[X_i \le u] = F(u)$,    $P[Y_j \le u] = F(u-\Delta)$.


The random variables $X_1, \dots, X_m, Y_1-\Delta, \dots, Y_n-\Delta$
are then i.i.d. random variables, since the random variables
$Y_1-\Delta, \dots, Y_n-\Delta$ are obtained by shifting the Y-sample
by $\Delta$ to the left.

The idea is to estimate $\Delta$ by the amount of shift needed
to align as closely as possible the two sets $(X_1, \dots, X_m)$ and
$(Y_1-\Delta, \dots, Y_n-\Delta)$. The definition of alignment could be
given with reference to the Wilcoxon statistic, by defining the two
sets to be aligned if half of the non-zero differences
$(Y_j-\Delta)-X_i$ are positive and half negative. There is either a
unique such value $\Delta$, which could then serve as an estimate, or
an interval of such values; in the latter case, the midpoint of this
interval provides a natural estimate.

More generally, if a test of the hypothesis $\Delta = 0$ is based
on a statistic whose distribution is symmetric about a point $\mu$,
the two sets could be defined to be in alignment when they give the
test statistic the value $\mu$. Let us assume now that
$F \in \mathcal{F}_0$ or $F \in \mathcal{F}_1$, where

        $\mathcal{F}_0$ = {set of all continuous distributions},
        $\mathcal{F}_1$ = {set of all continuous distributions, symmetric
                         about zero}.


Consider a test statistic

        $h(X_1, \dots, X_m, Y_1, \dots, Y_n)$

for the hypothesis $H : \Delta = 0$ against the alternative
$\Delta \neq 0$. We shall assume throughout that $h$ satisfies:
(A) $h(x_1, \dots, x_m, y_1+a, \dots, y_n+a)$ is a nondecreasing
    function of $a$ for all $x$ and $y$.
(B) when $\Delta = 0$, the distribution of
    $h(X_1, \dots, X_m, Y_1, \dots, Y_n)$ is symmetric about a fixed
    point $\mu$ (independent of $F$),
    (i) for all $F \in \mathcal{F}_0$ or (ii) for all $F \in \mathcal{F}_1$.


We shall use the following abbreviations in the notation:
$x = (x_1, \dots, x_m)$, $y = (y_1, \dots, y_n)$; $x < x'$ means that
the inequality holds for each coordinate; if $a$ is a real number, then
$x+a = (x_1+a, \dots, x_m+a)$. The notation $P_0$ will mean that the
probability in question is being computed for the case $\Delta = 0$.
Definition 1: $\hat\Delta = \tfrac{1}{2}(\Delta^* + \Delta^{**})$, where

        $\Delta^* = \sup \{\Delta : h(x, y-\Delta) > \mu\}$,

and

        $\Delta^{**} = \inf \{\Delta : h(x, y-\Delta) < \mu\}$.

Then $\hat\Delta$ is proposed for the two sample problem as an estimate
of the shift parameter $\Delta$, for a suitable function $h$.
In the case of the one sample problem, suppose $Z_1, \dots, Z_N$
are independent random variables with common distribution

        $P[Z_i \le u] = F(u-\theta)$,

where $F$ is continuous and symmetric about zero, i.e.,
$F \in \mathcal{F}_1$. As in the two sample problem, we will base an
estimate on a test statistic $h(Z_1, \dots, Z_N)$ for the hypothesis
$\theta = 0$ against the alternative $\theta \neq 0$. We shall again
assume throughout that $h$ satisfies:


(C) $h(z_1+a, \dots, z_N+a)$ is a nondecreasing function of $a$ for
    each $z$.
(D) for $\theta = 0$, the distribution of $h$ is symmetric about
    a fixed point $\mu$ (independent of $F$) for all
    $F \in \mathcal{F}_1$.

If $\mu$ is the median of $h(Z)$ when $\theta = 0$, we define


Definition 2: $\hat\theta = \tfrac{1}{2}(\theta^* + \theta^{**})$, where

        $\theta^* = \sup \{\theta : h(z-\theta) > \mu\}$,

and

        $\theta^{**} = \inf \{\theta : h(z-\theta) < \mu\}$.

Then $\hat\theta$ is proposed for the one sample problem as an estimate
of the location parameter $\theta$.
II.3 Estimates Based on Rank Tests

An important class of rank statistics for the two-sample
problem is given by

(3.1)    $h(x,y) = \sum_{i=1}^{n} E_\Psi[V^{(s_i)}]$,

where $s_1 < \dots < s_n$ denote the ranks of $y_1, \dots, y_n$ in the
combined sample and where $V^{(1)} < \dots < V^{(m+n)}$ denote an
ordered sample of size $m+n$ from a distribution $\Psi$.


The function $h$ defined by (3.1) satisfies requirement (A).
Conditions under which $h$ satisfies requirement (B) are given in
the following lemma, in which $h$ is not assumed to satisfy (3.1).

LEMMA 1: The distribution of $h(X,Y)$ is symmetric about $\mu$ if
any of the following three conditions holds.
(i) $h$ is a function only of the ranks and satisfies
(3.2)    $h(x,y) + h(-x,-y) = 2\mu$    (a.e. $P_0$)
(ii) the sample sizes $m$ and $n$ are equal and $h$ satisfies
(3.3)    $h(x,y) + h(y,x) = 2\mu$    (a.e. $P_0$)
(iii) the distribution $F$ is symmetric about zero and $h$
satisfies (3.2).

Conditions under which the function $h$ defined by (3.1) satisfies
(3.2) or (3.3) are given in the following lemma.


LEMMA 2: Let $h$ be defined by (3.1). Then
(i) if $\Psi$ is symmetric about $b$, the function $h$ satisfies
(3.2) with $\mu = nb$;
(ii) if $m = n$ and $b$ denotes the expectation of $\Psi$, the
function $h$ satisfies (3.3) with $\mu = \tfrac{1}{2}(m+n)b$.


It follows from Lemmas 1 and 2 that a function $h$ given
by (3.1) satisfies condition (B)(i) of the preceding section if
either $\Psi$ is symmetric or the two sample sizes $m$ and $n$ are
equal.

Among the statistics given by (3.1) and satisfying (B)(i)
we shall be interested in the Wilcoxon statistic and the normal scores
statistic, obtained by taking for $\Psi$ a rectangular or normal
distribution respectively.
Suppose now the sample sizes are equal, i.e., $m = n$.
Denote the sample mean by $\bar{x} = \frac{1}{m}(x_1 + \dots + x_m)$,
and the sample median by

        $\tilde{x} = \mathrm{med}\ x = x^{(k+1)}$                            if $m = 2k+1$,
        $\tilde{x} = \mathrm{med}\ x = \tfrac{1}{2}(x^{(k)} + x^{(k+1)})$    if $m = 2k$,

where $x^{(1)} < \dots < x^{(m)}$ denote the ordered x's. We can see that
$h(x,y) = \bar{y} - \bar{x}$ and $h(x,y) = \tilde{y} - \tilde{x}$ both
satisfy (3.3) with $\mu = 0$ and $\Delta^* = \Delta^{**} = h$. The
estimates for shift are therefore $\bar{Y} - \bar{X}$ and
$\tilde{Y} - \tilde{X}$, respectively.


If in addition to (3.3) we impose on $h$ the condition
(3.4)    $h(x, y+a) = h(x,y) + a$    for all real $a$,
then we can assume without loss of generality that $\mu = 0$, since
the function $h'(x,y) = h(x, y-\mu)$ satisfies (3.3) with $\mu = 0$.
Condition (3.4) then implies $\Delta^* = \Delta^{**} = h$, since for
example

        $\Delta^{**}(x,y) = \inf \{\Delta : h(x, y-\Delta) < 0\} = \inf \{\Delta : h(x,y) < \Delta\} = h(x,y)$.


Suppose now that $h(x,y)$ is the test statistic of the Wilcoxon
two-sample test in the Mann-Whitney form, i.e., $h(x,y)$ is the number
of pairs $(i,j)$ such that $x_i < y_j$, $i = 1, \dots, m$;
$j = 1, \dots, n$.

This is equivalent to the test based on (3.1) with $\Psi$
the rectangular distribution on (0,1). The values of the function $h$,
which satisfies requirement (B)(i), are the integers $0, 1, \dots, mn$.
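The Mann-Whitney form of the statistic is easy to compute directly. The sketch below uses our own function names and a small hypothetical sample without ties; it counts the pairs, checks the usual relation between the pair count and the rank sum of the y's (for rectangular scores the statistic (3.1) is proportional to that rank sum), and illustrates requirement (A).

```python
def mann_whitney_count(x, y):
    """Number of pairs (i, j) with x[i] < y[j] (Mann-Whitney form)."""
    return sum(1 for xi in x for yj in y if xi < yj)

def rank_sum_of_y(x, y):
    """Sum of the ranks of the y's in the combined sample (no ties assumed)."""
    combined = sorted(list(x) + list(y))
    return sum(combined.index(yj) + 1 for yj in y)

# Hypothetical samples with m = 3 and n = 4.
x = [1.2, 3.4, 0.7]
y = [2.5, 4.1, 3.9, 0.9]
n = len(y)

u = mann_whitney_count(x, y)
# Without ties, the rank sum and the pair count differ by the constant n(n+1)/2.
rank_sum = rank_sum_of_y(x, y)

# Requirement (A): shifting the y-sample to the right never decreases h.
counts = [mann_whitney_count(x, [yj + a for yj in y]) for a in range(-5, 6)]
```

The list `counts` runs from 0 (all y's below all x's) up to mn (all y's above all x's), which matches the range of values stated above.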
To find an explicit expression for the estimate $\hat\Delta$ of
Definition 1, based on $h(x,y)$, denote by $W^{(1)} < \dots < W^{(mn)}$
the ordered differences $Y_j - X_i$. Suppose first $mn$ is odd,
$mn = 2k+1$, say. Then $\mu = k + \tfrac{1}{2}$, and

        $\Delta^{**} = \inf \{\Delta : h(x, y-\Delta) < k + \tfrac{1}{2}\}$
                    $= \inf \{\Delta :$ fewer than $k + \tfrac{1}{2}$ of the differences $Y_j - X_i$ exceed $\Delta\}$
                    $= \inf \{\Delta : W^{(k+1)} \le \Delta\} = W^{(k+1)}$.

Similarly

        $\Delta^* = \sup \{\Delta :$ more than $k + \tfrac{1}{2}$ of the differences $Y_j - X_i$ exceed $\Delta\}$
                 $= \sup \{\Delta : W^{(k+1)} > \Delta\} = W^{(k+1)}$.

Hence $\hat\Delta = W^{(k+1)}$. Suppose now $mn$ is even, $mn = 2k$, say.
Then

        $\Delta^{**} = \inf \{\Delta : W^{(k+1)} \le \Delta\} = W^{(k+1)}$,
        $\Delta^* = \sup \{\Delta : W^{(k)} > \Delta\} = W^{(k)}$;

hence

        $\hat\Delta = \tfrac{1}{2}(W^{(k)} + W^{(k+1)})$.

Thus
(3.5)    $\hat\Delta = \mathrm{med}\,(Y_j - X_i)$
is the median of the $mn$ differences $Y_j - X_i$. The estimate in the
case of normal scores is similar. Next we shall consider the
case of the one-sample problem.
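The two-sample estimate (3.5), the median of all mn differences, can be sketched as follows. The function name and the sample values are ours and purely illustrative.

```python
from statistics import median

def hl_shift_estimate(x, y):
    """Median of the m*n differences y[j] - x[i], as in (3.5)."""
    return median(yj - xi for xi in x for yj in y)

# Hypothetical samples; the y-sample contains one gross error.
x = [0.0, 1.0, 2.0]
y = [5.0, 6.0, 7.0, 100.0]

shift = hl_shift_estimate(x, y)                  # median of the 12 differences
mean_shift = sum(y) / len(y) - sum(x) / len(x)   # classical estimate

# After shifting y to the left by the estimate, half of the differences are positive.
positive = sum(1 for xi in x for yj in y if (yj - shift) - xi > 0)
```

Here the classical difference of means is dominated by the single outlying observation, while the median of differences stays close to the shift between the bulk of the two samples.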
Let $Z_1, \dots, Z_N$ be independent identically distributed
random variables with distribution $F(u-\theta)$, where $F$ is symmetric
about zero. Let $s_1 < \dots < s_n$ denote the ranks of the positive Z's
among the $N$ absolute values $|Z_1|, \dots, |Z_N|$. Here $n$ is a
random variable, which for $\theta = 0$ has the binomial distribution
$B(N, \tfrac{1}{2})$.


A class of rank tests is based on the test statistic

(3.6)    $h(z) = \sum_{i=1}^{n} E_\Psi[V^{(s_i)}]$,

where $V^{(1)} < \dots < V^{(N)}$ denote the ordered absolute values of
a sample of size $N$ from a distribution $\Psi$. The function $h$ given
by (3.6) satisfies requirement (C) of the previous section. The
function $h$ satisfies requirement (D) if
(3.7)    $h(z) + h(-z) = 2\mu$    (a.e. $P_0$)


The following lemma states that requirement (D) is in
fact satisfied for any function $h$ defined by (3.6).

LEMMA 3: If $h$ is given by (3.6) and $\theta = 0$, the distribution of
$h$ is symmetric about $\mu = \tfrac{1}{2} N\, E_\Psi|V_1|$ for all
$F \in \mathcal{F}_1$.
An important case is again the Wilcoxon statistic, which corresponds to
the choice of the rectangular distribution for $\Psi$ in (3.6).


To find an explicit expression for the estimate $\hat\theta$ of
Definition 2, based on $h(z)$, we shall use an equivalent form of the
test statistic $h(z)$ due to Tukey [13], namely:

(3.8)    $h(z)$ = number of pairs $(i,j)$ with $1 \le i \le j \le N$
         such that $z_i + z_j > 0$.

The possible values of $h$ are the integers
$0, 1, \dots, \tfrac{1}{2}N(N+1)$. Let $W^{(1)} < \dots < W^{(K)}$ be
the $K = \tfrac{1}{2}N(N+1)$ averages $\tfrac{1}{2}(Z_i + Z_j)$ with
$i \le j$.

Then similarly as in the case of the two-sample problem

(3.9)    $\hat\theta = \mathrm{med}_{i \le j}\ \tfrac{1}{2}(Z_i + Z_j)$,

the median of the $\tfrac{1}{2}N(N+1)$ averages $\tfrac{1}{2}(Z_i + Z_j)$.
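The one-sample estimate (3.9) and the pair-counting form of the signed-rank statistic can be illustrated together. The sketch below uses our own function names and a hypothetical sample without ties or zeros; it assumes the pair count in (3.8) is taken over pairs whose sum is positive, and compares it with the sum of the ranks of the positive observations, to which (3.6) is proportional for rectangular scores.

```python
from statistics import median

def walsh_averages(z):
    """All N(N+1)/2 averages (z[i] + z[j]) / 2 with i <= j."""
    n = len(z)
    return [(z[i] + z[j]) / 2 for i in range(n) for j in range(i, n)]

def hl_location_estimate(z):
    """Median of the pairwise averages, as in (3.9)."""
    return median(walsh_averages(z))

def signed_rank_sum(z):
    """Sum of the ranks of the positive z's among the absolute values."""
    abs_sorted = sorted(abs(v) for v in z)
    return sum(abs_sorted.index(abs(v)) + 1 for v in z if v > 0)

def positive_pair_count(z):
    """Number of pairs i <= j with z[i] + z[j] > 0."""
    return sum(1 for a in walsh_averages(z) if a > 0)

# Hypothetical sample of size N = 5.
z = [1.5, -0.5, 2.0, -3.0, 0.8]
estimate = hl_location_estimate(z)
```

For this sample the two ways of computing the signed-rank statistic agree, and the estimate is the middle one of the 15 pairwise averages.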


Other estimates are obtained by taking for $h$ a function
that satisfies (3.7) and the following translation invariance
requirement:
(3.10)    $h(z+a) = h(z) + a$    for all real $a$.

As in the corresponding case of the two-sample problem we
can assume without loss of generality that $\mu = 0$, and then
$\hat\theta(z) = h(z)$. Examples of this are the functions
(i) $h(z) = \bar{z}$ and (ii) $h(z) = \tilde{z}$.

For the proofs of the above lemmas see [5], pp. 601-603.


II.4 Properties of Estimates

A) Small Sample Properties of Estimates
The estimates $\hat\Delta$ and $\hat\theta$ of a shift or location
parameter are translation invariant and approximately median unbiased.
The following theorems give conditions under which the
distributions of $\hat\Delta$ and $\hat\theta$ are symmetric, so that in
particular the estimates are unbiased.
THEOREM 1: The distribution of the estimate $\hat\Delta$ of Definition 1
is symmetric about $\Delta$ if either one of the following conditions
holds:
(i) the distribution $F$ defined in (2.1) is symmetric and
$h$ satisfies (3.2) and the invariance relation
(4.1)    $h(x+a, y+a) = h(x,y)$    for all real $a$,
(ii) the two sample sizes $m$ and $n$ are equal, and $h$ satisfies
(3.3) and (4.1).

Corollary: If $h$ is given by (3.1), then the distribution of
$\hat\Delta$ is symmetric about $\Delta$ if either one of the following
conditions holds:
(i) the distributions $F$ and $\Psi$ are symmetric,
(ii) the sample sizes $m$ and $n$ are equal.

THEOREM 2: The distribution of the estimate $\hat\theta$ of Definition 2
is symmetric about $\theta$ if
(i) $F$ is symmetric about zero and $h$ satisfies (3.7), and
hence in particular if
(ii) $h$ is given by (3.6).

For the proofs of the above theorems see [5], pp. 605-607.


B) Asymptotic Properties of Estimates
The following theorems will be concerned with asymptotic
properties of the estimates $\hat\Delta$ and $\hat\theta$. For that
purpose we shall use the following notation.

For the two-sample problem, let $m(N)$, $n(N)$ for $N = 1, 2, \dots$
be a sequence of pairs of sample sizes tending to infinity in such a
way that $m(N)/N \to \lambda$, say, and let $\Delta_N$ be a sequence of
values of the parameter $\Delta$. Also, for the one-sample problem
consider a sequence $\theta_N$ of values of $\theta$. In both cases we
shall indicate the dependence of $h$ and $\mu$ on $N$ by writing $h_N$
and $\mu_N$.


THEOREM 3: Let $a, c_1, c_2, \dots$ be real constants, and let

        $\Delta_N = -a/c_N$,    $\theta_N = -a/c_N$.

Let $G$ be the continuous distribution function of a random variable
with mean zero and unit variance, and suppose

(4.2)    $\lim_{N \to \infty} P_N\{c_N(h_N - \mu_N) \le u\} = G\left(\frac{u + aB}{A}\right)$,

where $P_N$ indicates that the probability is computed for the
parameter values $\Delta_N$ or $\theta_N$, and where $h_N$ stands for
$h_N(X,Y)$ or $h_N(Z)$. Then for any fixed $\Delta$ and $\theta$,

(4.3)    $\lim_{N \to \infty} P\{c_N(\hat\Delta_N - \Delta) \le a\} = \lim_{N \to \infty} P\{c_N(\hat\theta_N - \theta) \le a\} = G\left(\frac{aB}{A}\right)$.

Consider now test statistics given by (3.1). Then from
the results of Chernoff and Savage [2] (1958) or Puri [12] (1963),
Theorem 7.1, under suitable regularity conditions on $\Psi$,
$N^{1/2}[h_N(X,Y) - \mu_N]$ satisfies the assumptions of Theorem 3,
with $G$ the standard normal distribution, and with $A$ and $B$ given
by (4.4) and

(4.5)    $B = \lambda(1-\lambda) \int \frac{d}{dx} J[F(x)]\,dF(x)$,

where $J = \Psi^{-1}$. This together with Theorem 3 gives:

THEOREM 4: If $h$ is given by (3.1) with $\Psi$ satisfying the
assumptions of Theorem 7.1 of Puri (1963), and if
$m(N)/N \to \lambda$ as $N \to \infty$, then
$N^{1/2}(\hat\Delta - \Delta)$ has a limiting normal distribution with
mean zero and variance $A^2/B^2$, where $A$ and $B$ are given by (4.4)
and (4.5).

For the particular case of $\Psi$ being the rectangular distribution
on (0,1) we have $A^2 = \lambda(1-\lambda)/12$,
$B = \lambda(1-\lambda) \int f^2(x)\,dx$, where $f(x)$ is the density
of $F(x)$. The asymptotic variance of $N^{1/2}(\hat\Delta - \Delta)$ in
this case is given by:
(4.6)    $1/[12\lambda(1-\lambda)(\int f^2(x)\,dx)^2]$


The following theorem shows that under suitable regularity
conditions, the estimates $\hat\Delta$ and $\hat\theta$ have desirable
efficiency properties.

THEOREM 5: Let $\hat\Delta_N$ and $\hat\Delta'_N$ (or $\hat\theta_N$
and $\hat\theta'_N$) be estimates of $\Delta$ (or $\theta$) based on
sequences of test statistics $h_N$ and $h'_N$ satisfying the
assumptions of Theorem 3 for the same limiting distribution $G$. Then
the asymptotic relative efficiency of $\hat\Delta_N$ relative to
$\hat\Delta'_N$ (or of $\hat\theta_N$ relative to $\hat\theta'_N$), in
the sense of the reciprocal ratio of asymptotic variances, is the same
as the corresponding Pitman efficiency of the two sequences of tests
based on $h_N$ and $h'_N$, provided the latter exists and
$c_N = c'_N = N^{1/2}$.

For the proofs of the above theorems see [5], pp. 608-610.

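It follows from this theorem that the efficiency of the
estimates $\mathrm{med}\,(Y_j - X_i)$ or
$\mathrm{med}\ \tfrac{1}{2}(Z_i + Z_j)$ relative to the classical
estimates $\bar{Y} - \bar{X}$ and $\bar{Z}$ respectively is
$12\sigma^2(\int f^2(x)\,dx)^2$, where $\sigma^2$ is the variance of $F$,
which in the case of normal $F$ is $3/\pi \approx 0.955$.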
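The value 0.955 quoted for the normal case can be checked directly. The following short derivation is a sketch that assumes $F$ is normal with mean zero and variance $\sigma^2$ (the mean does not affect the result):

```latex
f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/(2\sigma^2)}, \qquad
\int f^2(x)\,dx = \frac{1}{2\pi\sigma^2} \int e^{-x^2/\sigma^2}\,dx
                = \frac{\sigma\sqrt{\pi}}{2\pi\sigma^2}
                = \frac{1}{2\sigma\sqrt{\pi}},
\qquad
12\sigma^2 \Bigl( \int f^2(x)\,dx \Bigr)^2
   = \frac{12\sigma^2}{4\pi\sigma^2} = \frac{3}{\pi} \approx 0.955 .
```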

It is interesting to compare this value with the corresponding
values for small $N$. For $N = 1$ and 2, $\hat\theta_N = \bar{Z}$ so
that the efficiency is 1. For $N = 3$
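the six averages are $Z_1$, $Z_2$, $Z_3$, $\tfrac{1}{2}(Z_1+Z_2)$,
$\tfrac{1}{2}(Z_1+Z_3)$, $\tfrac{1}{2}(Z_2+Z_3)$.
Let the ordered Z's be denoted by $Z^{(1)} < Z^{(2)} < Z^{(3)}$. Then

        $Z^{(1)} < \tfrac{1}{2}(Z^{(1)}+Z^{(2)}) < Z^{(2)} < \tfrac{1}{2}(Z^{(2)}+Z^{(3)}) < Z^{(3)}$

and

        $Z^{(1)} < \tfrac{1}{2}(Z^{(1)}+Z^{(2)}) < \tfrac{1}{2}(Z^{(1)}+Z^{(3)}) < \tfrac{1}{2}(Z^{(2)}+Z^{(3)}) < Z^{(3)}$.

From these inequalities it follows that $\hat\theta_3$ is the average
of $Z^{(2)}$ and $\tfrac{1}{2}(Z^{(1)}+Z^{(3)})$, so that

        $\hat\theta_3 = \tfrac{1}{4}[Z^{(1)} + 2Z^{(2)} + Z^{(3)}]$.

From a table of the covariances of normal order statistics, the
efficiency of $\hat\theta_3$ is 0.979.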
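The closed form for N = 3, namely one quarter of (smallest + 2 * middle + largest), can be checked numerically against the median of the six pairwise averages. The sketch below uses our own helper names and hypothetical triples.

```python
from itertools import permutations
from statistics import median

def walsh_median(z):
    """Median of the averages (z[i] + z[j]) / 2 over all pairs i <= j."""
    n = len(z)
    return median((z[i] + z[j]) / 2 for i in range(n) for j in range(i, n))

def closed_form_n3(z):
    """(Z(1) + 2 Z(2) + Z(3)) / 4 for a sample of size three."""
    low, mid, high = sorted(z)
    return (low + 2 * mid + high) / 4

# The identity does not depend on the order in which the observations arrive.
triples = list(permutations((-2.0, 4.0, 10.0))) + [(0.3, -1.2, 2.5)]
max_gap = max(abs(walsh_median(t) - closed_form_n3(t)) for t in triples)
```

For the triple (-2, 4, 10) both expressions give 4; for (0.3, -1.2, 2.5) both give 0.475 up to rounding.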


Generally, for any $F$ with bounded density $f$, the value
$12\sigma^2(\int f^2(x)\,dx)^2 \ge 0.864$, hence the estimate based on
the Wilcoxon test is efficiency robust.
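The efficiency expression 12 * variance * (integral of f squared) squared can be evaluated numerically for particular densities. The sketch below is an illustration with our own helper names and a simple Simpson rule; the parabolic density used as the worst case is an assumption taken from the Hodges-Lehmann literature on this bound, not from the text above.

```python
import math

def simpson(g, a, b, steps=2000):
    """Composite Simpson rule on [a, b] with an even number of subintervals."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def wilcoxon_efficiency(f, a, b):
    """12 * variance * (integral of f^2)^2 for a density f supported on [a, b]."""
    mean = simpson(lambda x: x * f(x), a, b)
    variance = simpson(lambda x: (x - mean) ** 2 * f(x), a, b)
    int_f2 = simpson(lambda x: f(x) ** 2, a, b)
    return 12.0 * variance * int_f2 ** 2

r = math.sqrt(5.0)
parabolic = lambda x: 3.0 * (5.0 - x * x) / (20.0 * r)    # assumed extremal density
normal = lambda x: math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
uniform = lambda x: 1.0                                    # rectangular on (0, 1)

eff_parabolic = wilcoxon_efficiency(parabolic, -r, r)
eff_normal = wilcoxon_efficiency(normal, -12.0, 12.0)      # tails beyond 12 ignored
eff_uniform = wilcoxon_efficiency(uniform, 0.0, 1.0)
```

Under these assumptions the parabolic density gives a value of about 0.864, the normal density about 0.955, and the rectangular density exactly 1.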
CHAPTER III


MINIMAX APPROACH FOR ROBUSTNESS 


III.1 Outline of Huber's Results 

In this chapter we shall present the approach of P.J. Huber 
[6] toward a theory of robust estimation. For the problem of estimat- 
ing a single location parameter, he developed a general method for 
obtaining estimates based on minimal principle. 

In his work, he considered two models of indeterminacy. In
the first model the prototype distribution $F(x)$ was given by

        $F(x) = (1-\varepsilon)\Phi(x) + \varepsilon H(x)$,

where $0 \le \varepsilon < 1$ is a known number, $\Phi(x)$ is the
standard normal distribution, and $H$ is an unknown contaminating
distribution. (This model of indeterminacy is a generalization of the
model studied by Tukey.) In the second model $F(x)$ was given by

        $\sup_{-\infty < x < \infty} |F(x) - \Phi(x)| \le \varepsilon$.

Under suitable regularity conditions, asymptotic normality
of the estimates based on the minimal principle was shown and an
explicit expression for the asymptotic variance was given.

Using a minimax approach Huber showed that for the first
model of indeterminacy there exists a unique most robust estimate
based on the minimal principle, if as a measure of robustness of
an estimate we accept the inverse of the supremum of the asymptotic
variance over the set of all contaminated distribution functions.
The most robust estimate $T_n$ of location is defined by

        $\sum_{i=1}^{n} \rho(X_i - T_n)$ = minimum,

where

        $\rho(t) = \tfrac{1}{2} t^2$               for $|t| < k$,
        $\rho(t) = k|t| - \tfrac{1}{2} k^2$        for $|t| \ge k$,

with $k$ depending on $\varepsilon$.
It was shown in [8] that the above estimate is asymptotically
equivalent to the trimmed mean $\bar{X}_\alpha$ introduced by Tukey.


In the following sections we shall summarize Huber's results. 


III.2 Point Estimates Based on Minimal Principle: The M-Estimates 

The method of least squares proposes the value which minimizes
the sum of squares of differences between observed and expected values
as an estimate of the unknown parameter. In the case of estimating
a single location parameter one has to minimize the expression

        $\sum_{i=1}^{n} (X_i - T)^2$.

This is achieved by the sample mean $T = \frac{1}{n} \sum_{i=1}^{n} X_i$.

It is natural to ask whether one can obtain "more robustness"
by minimizing another function of the differences between observed
and expected values than the sum of their squares. Let
$X_1, \dots, X_n$ be a random sample. The estimate
$T_n(X) = T_n(X_1, \dots, X_n)$ of location, based on the minimal
principle (M-estimate for short), is defined as a solution of the
expression

(2.1)    $\sum_{i=1}^{n} \rho(X_i - T_n(X))$ = minimum,

where $\rho$ is a non-constant function.

This class of estimates contains in particular
(i) the sample mean if $\rho(t) = t^2$,
(ii) the sample median if $\rho(t) = |t|$.
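The two special cases above can be reproduced with a direct numerical minimization. The sketch below is only an illustration: the function name is ours, and it assumes that rho is convex with its minimum at 0, so that the sum to be minimized is convex and a ternary search between the smallest and largest observation finds a minimizer.

```python
def m_estimate(xs, rho, iters=200):
    """Minimize Q(xi) = sum(rho(x - xi)) over xi by ternary search.

    Assumes rho is convex with its minimum at 0, so that Q is convex and
    attains its minimum between min(xs) and max(xs).
    """
    lo, hi = min(xs), max(xs)

    def q(xi):
        return sum(rho(x - xi) for x in xs)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if q(m1) <= q(m2):
            hi = m2      # a minimizer lies in [lo, m2]
        else:
            lo = m1      # a minimizer lies in [m1, hi]
    return (lo + hi) / 2.0

# Hypothetical sample with one gross error.
xs = [1.0, 2.0, 3.0, 4.0, 100.0]
mean_like = m_estimate(xs, lambda t: t * t)   # close to the sample mean, 22
median_like = m_estimate(xs, abs)             # close to the sample median, 3
```

With an even number of observations and rho(t) = |t| the minimizing set is an interval, and this search returns some point of that interval rather than its midpoint.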


After defining the class of M-estimates, a criterion of
robustness of an estimate must be agreed on, which will allow us to
choose the best estimate from a set of estimates (with respect to
that criterion). Unfortunately there is no unanimity about this
question. We shall accept as a measure of robustness of an estimate
the inverse of the supremum of the asymptotic variance, when $F$
ranges over a suitable set of underlying distributions, in particular
over the set of all $F = (1-\varepsilon)\Phi + \varepsilon H$ for fixed
$\varepsilon$ and symmetric $H$.

It will be shown that if we accept this measure of robustness and
we restrict attention to M-estimates, then the most robust estimate
of location corresponds to

        $\rho(t) = \tfrac{1}{2} t^2$    for $|t| < k$

and

        $\rho(t) = k|t| - \tfrac{1}{2} k^2$    for $|t| \ge k$,

where $k$ depends on $\varepsilon$.

Thus the most robust estimate of location $T_n$ is defined by

(2.2)    $\sum_{i=1}^{n} \psi(X_i - T_n) = 0$,

where $\psi(t) = t$ for $|t| < k$ and $\psi(t) = k\,\mathrm{sgn}(t)$
for $|t| \ge k$, with $k$ depending on $\varepsilon$.
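Equation (2.2) can be solved numerically because the clipped function psi is monotone. The sketch below is an illustration with our own function names; it uses bisection on the sum of psi(x - T), takes the observations at unit scale (no scale estimate is involved), and treats the constant k as given rather than derived from the contamination fraction.

```python
def huber_psi(t, k):
    """psi(t) = t for |t| < k and k * sgn(t) otherwise."""
    return max(-k, min(k, t))

def huber_rho(t, k):
    """rho(t) = t^2 / 2 for |t| < k and k|t| - k^2 / 2 otherwise."""
    return 0.5 * t * t if abs(t) < k else k * abs(t) - 0.5 * k * k

def huber_estimate(xs, k, tol=1e-10):
    """Solve sum(psi(x - T)) = 0 for T by bisection.

    The sum is nonincreasing in T, nonnegative at min(xs) and
    nonpositive at max(xs), so a root lies between the two.
    """
    lo, hi = min(xs), max(xs)
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if sum(huber_psi(x - mid, k) for x in xs) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical sample with one gross error.
xs = [1.0, 2.0, 3.0, 4.0, 100.0]
robust = huber_estimate(xs, 1.5)        # the outlier contributes at most k
like_mean = huber_estimate(xs, 1000.0)  # very large k: psi is linear, giving the mean
```

For this sample the estimate with k = 1.5 is 3, while a very large k reproduces the sample mean 22, in line with the estimate lying between the median and the mean.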
III.3 Asymptotic Normality of M-Estimates for Convex $\rho$
In this section we are assuming throughout that $\rho(t)$ is
a convex real-valued function of a real variable $t$, tending to
$+\infty$ as $t \to \pm\infty$.


Definition 1: Let $X_1, \dots, X_n$ be independent, identically
distributed random variables with common distribution function $F$.
Let $[T_n(X)]$ be the set of all those $\xi$ for which
$Q(\xi) = \sum_{i=1}^{n} \rho(X_i - \xi)$ reaches its infimum
$Q_{\inf}$. Then we define the M-estimate $T_n(X)$ as any
representation of the set valued function
$(X_1, \dots, X_n) \to [T_n(X)]$ by a single valued function
$(X_1, \dots, X_n) \to T_n(X) \in [T_n(X)]$, for instance
$T_n(X)$ = midpoint of $[T_n(X)]$ if $[T_n(X)]$ is an interval.
The set $[T_n(X)]$ is invariant under translation,
i.e., $[T_n(X+c)] = [T_n(X)] + c$.


LEMMA 1: $Q(\xi)$ is a convex function of $\xi$, and $[T_n(X)]$ is
non-empty, convex and compact. If $\rho$ is strictly convex, then
$[T_n(X)]$ is reduced to a single point.

PROOF: (Strict) convexity of $Q$ follows from (strict) convexity
of $\rho$. The sets
$E_m = \{\xi : Q(\xi) \le Q_{\inf} + 1/m\}$, $m = 1, 2, \dots$, form a
decreasing sequence of non-empty, convex, compact sets as
$m \to \infty$, hence their intersection $[T_n(X)]$ is non-empty,
convex, compact.

If $\rho$ is strictly convex, and if $\xi'$, $\xi''$ were two distinct
points of $[T_n(X)]$, then we would have

        $Q(\tfrac{1}{2}(\xi' + \xi'')) < \tfrac{1}{2} Q(\xi') + \tfrac{1}{2} Q(\xi'') = Q_{\inf}$,

which is a contradiction. Hence $\xi' = \xi'' = T_n(X)$.

Let $\psi(t) = \rho'(t)$ be the derivative of $\rho(t)$, normalized
such that $\psi(t) = \tfrac{1}{2}\psi(t-0) + \tfrac{1}{2}\psi(t+0)$.
$\psi$ is monotone increasing and strictly negative (positive) for
large negative (positive) values of $t$.
Definition 2: If $\psi(t)$ is continuous, then $T_n(X)$ is the solution
of the equation

        $\sum_{i=1}^{n} \psi(X_i - T_n(X)) = 0$.

Definition 3: Define

        $\lambda(\xi) = \int \psi(t-\xi)\,dF(t) = E[\psi(X-\xi)]$.


LEMMA 2: If there is a $\xi_0$ such that $\lambda(\xi_0)$ exists and
is finite, then $\lambda(\xi)$ exists for all $\xi$ (possibly
$\lambda(\xi) = \pm\infty$), is monotone decreasing and strictly
positive (negative) for large negative (positive) values of $\xi$.
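The behaviour described in Lemma 2 can be observed numerically for one concrete choice. The sketch below assumes, for illustration only, the clipped function psi(t) = max(-k, min(k, t)) with k = 1.5 and a standard normal F; the helper names and the Simpson rule on a truncated range are ours.

```python
import math

def huber_psi(t, k):
    """psi(t) = t for |t| < k and k * sgn(t) otherwise."""
    return max(-k, min(k, t))

def normal_density(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def simpson(g, a, b, steps=4000):
    """Composite Simpson rule on [a, b] with an even number of subintervals."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def lam(xi, k=1.5):
    """lambda(xi) = E[psi(X - xi)] for X standard normal (tails beyond 12 ignored)."""
    return simpson(lambda t: huber_psi(t - xi, k) * normal_density(t), -12.0, 12.0)

values = [lam(xi) for xi in (-2.0, -1.0, 0.0, 1.0, 2.0)]
```

In this example the values decrease from positive to negative, vanish at 0 by symmetry, and approach +k and -k far to the left and right.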


PROOF: Let $\psi = \psi^+ - \psi^-$, where $\psi^+$ and $\psi^-$ are
the positive and negative parts of $\psi$ respectively. Then

        $\lambda(\xi) = \int \psi^+(t-\xi)\,dF(t) - \int \psi^-(t-\xi)\,dF(t)$.

For $\xi = \xi_0$ both integrals exist and are finite. For
$\xi \ge \xi_0$ the first integral is bounded,

        $0 \le \int \psi^+(t-\xi)\,dF(t) \le \int \psi^+(t-\xi_0)\,dF(t)$,

and similarly for $\xi \le \xi_0$ the second integral is bounded,

        $0 \le \int \psi^-(t-\xi)\,dF(t) \le \int \psi^-(t-\xi_0)\,dF(t)$.

Hence at least one of the two integrals is finite, thus
$\lambda(\xi)$ exists everywhere. $\lambda(\xi)$ is monotone decreasing
in $\xi$ since $\psi(t-\xi)$ is.

Now we want to show that $\lambda(\xi)$ is strictly negative for
large positive values of $\xi$ and positive for large negative values
of $\xi$. Because of the monotonicity of $\psi$, it is sufficient to
prove the first assertion.

        $\int \psi^+(t-\xi)\,dF(t) < \varepsilon$,

which implies that

        $\int \psi^+(t-\xi)\,dF(t) \to 0$    for $\xi \to \infty$.

Since $\psi$ takes on strictly negative values, due to monotonicity

        $\int \psi^-(t-\xi)\,dF(t) > \delta$

for sufficiently large $\xi$, thus $\lambda(\xi)$ is strictly negative
for large values of $\xi$.

LEMMA 3: ("Consistency of $T_n$.") Assume that there is a $c$ such
that $\lambda(\xi) > 0$ for $\xi < c$ and $\lambda(\xi) < 0$ for
$\xi > c$. Then $T_n \to c$ almost surely and in probability.

PROOF: Let $\varepsilon > 0$. Then by the law of large numbers

        $\frac{1}{n} \sum_{i=1}^{n} \psi(X_i - c - \varepsilon) \to \lambda(c+\varepsilon) = E\psi[X - (c+\varepsilon)]$,

        $\frac{1}{n} \sum_{i=1}^{n} \psi(X_i - c + \varepsilon) \to \lambda(c-\varepsilon) = E\psi[X - (c-\varepsilon)]$

almost surely and in probability. Hence, by monotonicity of $\psi$,
for almost all sample sequences,
$(c-\varepsilon) < [T_n(X)] < (c+\varepsilon)$ holds from some $n$ on,
and similarly

        $P\{c-\varepsilon < [T_n(X)] < c+\varepsilon\} \to 1$.


LEMMA 4: ("Asymptotic normality.") Assume (i) $\lambda(c) = 0$,
(ii) $\lambda(\xi)$ is differentiable at $\xi = c$, and
$\lambda'(c) < 0$, (iii) $\int \psi^2(t-\xi)\,dF(t)$ is finite and
continuous at $\xi = c$. Then $n^{1/2}(T_n(X) - c)$ is asymptotically
normal with asymptotic mean 0 and asymptotic variance

        $V(\psi, F) = \int \psi^2(t-c)\,dF(t) \,/\, (\lambda'(c))^2$.

PROOF: Without loss of generality, suppose $c = 0$.

We have to show that for every fixed real number $g$,

        $P\{n^{1/2}\,T_n < g\,V^{1/2}\} \to \Phi(g)$.
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mean value theorem. 
apt n 
We shall see, that n : ) Y; is asymptotically normal 
i=1 
with mean O and variance 1, hence ee d(g). The 
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are independent, identically distributed random variables, but they 
are different for different values of n, therefore, the normal 
convergence criterion, as given in Loève [11] (1960), p. 295, will
be applied. 

The criterion states that the distribution of $n^{-1/2}\sum_{i=1}^{n} Y_i$
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we have 
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the integration being extended over the set E" = {Jucx,) | = Me} S46, 
This proves the theorem. 
For the remainder of this section, let $c = 0$. If $\psi$

III.4 Minimax Approach

Asymptotic theories of robust estimation in the case
of bounded $\psi$ $(= \rho')$ can be justified by the fact that frequently
the sample size is large enough to indicate deviations from
the assumed model, but not yet large enough to establish their nature.

In the case of the contaminated normal distribution $F = (1-\varepsilon)\Phi + \varepsilon H$,
this means that the asymptotic minimax theory would be appropriate
whenever the sample size is fairly large, but $\varepsilon n$, the average
number of outliers, is still rather small.

We shall treat a special case and solve it by a direct 
verification of saddlepoint property. 

Let $C$ be the set of all distributions of the form
$F = (1-\varepsilon)G + \varepsilon H$, where $0 \le \varepsilon < 1$ is a fixed number, $G$ is a fixed
and $H$ a variable distribution function.

Assume that $G$ has a convex support and a twice continuously
differentiable density $g$, such that $-\log g$ is convex on the
support of $G$.

Let $T_n$ be an (M)-estimate corresponding to a certain $\rho$,
let $\psi = \rho'$ be the derivative of $\rho$ and let $c$ be such that
$\lambda(c) = E_F\,\psi(X-c) = 0$.

We shall try to minimize the supremum $\sup_F V(\psi,F)$ of the
asymptotic variance only for those pairs $(\psi,F)$ for which $c = 0$.


THEOREM 1: The asymptotic variance $V(\psi,F)$ has a saddlepoint:
there is an $F_0 = (1-\varepsilon)G + \varepsilon H_0$ and a $\psi_0$ such that

$\sup_F V(\psi_0,F) = V(\psi_0,F_0) = \inf_\psi V(\psi,F_0),$

where $F$ ranges over those distributions in $C$ for which $E_F\,\psi_0 = 0$.

Let $t_0 < t_1$ be the endpoints of the interval where
$|g'/g| \le k$ (either or both of these endpoints may be at infinity), and
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Remark: The statement of this theorem, is unsatisfactory insofar 
as the class over which $H$ ranges depends on $\psi_0$. This could be
avoided by restricting $G$ to be symmetric, and letting $H$ range
over all symmetric distributions.


PROOF: The total mass of $F_0$ is 1, since

From this it follows that $H_0 = \varepsilon^{-1}[F_0 - (1-\varepsilon)G]$ has total
mass 1. It remains to check that $h_0$ is non-negative. But
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g(t) < g(t)e . This implies the non-negativity of h(t). 
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The right side is an obvious upper bound for V(W ,F) 
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provided EB Wes = 0, so we have VC) oF) < Vie Fe) fore all 
$F \in C$.

The inequality $V(\psi_0,F_0) \le V(\psi,F_0)$ follows directly from
inequality (3.1) of the previous section, noticing that


This proves the theorem. 

It can be shown, using results of Le Cam [9], [10] (1953),
(1958), that the (M)-estimate corresponding to $\psi_0$ minimizes the
maximal asymptotic variance not only among (M)-estimates, but even
among all translation invariant estimates.

The assumptions of Theorem 1 are satisfied if $G = \Phi$ is
the standard normal distribution, with density $\varphi(t) = (2\pi)^{-1/2}\exp\{-\tfrac{1}{2}t^2\}$
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CHAPTER IV 


INTERMEDIARIES BETWEEN SAMPLE MEAN 


AND SAMPLE MEDIAN 


IV.1 Introduction

In this chapter some new estimates of location which, in a 
sense, lie between sample mean and median will be presented. 

For the one-sample problem of estimating a single location
parameter $\mu$ $(-\infty < \mu < \infty)$ of a continuous and symmetric distribution
$F(x-\mu)$, i.e., the center of symmetry of the underlying distribution
$F(x-\mu)$, we accept the model of indeterminacy, with the prototype
distribution of the form:

$F_{\varepsilon,h}(x) = (1-\varepsilon)\,\Phi(x) + \varepsilon\,\Phi(x/h),$

where $\Phi(x)$ is the standard normal distribution, $\varepsilon$ is
a fixed fraction of contamination, $0 \le \varepsilon \le \tfrac{1}{2}$, and $h$ is the scale
ratio of the two component normal distributions, $1 \le h \le h^*$,
with $h^*$ fixed.

We shall confine our attention to the class of estimates of
the unknown center of symmetry $\mu$ which are intermediaries between
the sample mean and sample median in the sense that the sample mean
and sample median will be limiting cases in the class of estimates.
To determine a most robust estimate in the class of estimates, we
will take as a measure of robustness of an estimate the inverse
of the supremum of the asymptotic variance, when $F_{\varepsilon,h}(x-\mu)$ ranges
over the set

$\mathcal{F} = \{F_{\varepsilon,h}(x-\mu) : F_{\varepsilon,h}(x-\mu) = (1-\varepsilon)\,\Phi(x-\mu) + \varepsilon\,\Phi((x-\mu)/h),\ 1 \le h \le h^* < \infty\}$

for $\varepsilon$ and $h^*$ fixed. In a practical situation the class $\mathcal{F}$ would
be relevant if one can assume that the variance of the contaminating
distribution ("wider" normal distribution) has an upper bound.

The reason for confining our attention to intermediaries 
between sample mean and sample median is the following. 

The concept of a robust estimate even in the simplest case 
of the one-sample problem of estimating a single location parameter 
of a symmetric and continuous distribution permits different inter- 
pretations. Intuitively it corresponds to an estimate, (or rather 
to a sequence of estimates), which (i) can tolerate at least a 
few outliers, i.e., which can tolerate at least a few grossly erroneous 
observations in the sample, but at the same time it (ii) has small 
asymptotic variance when there is no contamination present. Since 
in our accepted model of indeterminacy the sample mean satisfies 
(ii) but not (i) and the sample median satisfies (i) but not (ii) 


we tried for some compromise. 
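
The tension between requirements (i) and (ii) can be illustrated numerically. The following is a minimal Python sketch, not part of the original text. It assumes the prototype distribution $F_{\varepsilon,h}$ of this section with $\mu = 0$ and uses the standard asymptotic variances of the two limiting estimates: the variance of $F_{\varepsilon,h}$ for the sample mean, and $1/(4f(0)^2)$ for the sample median, where $f$ is the density of $F_{\varepsilon,h}$. The function names are illustrative only.

```python
import math

def mean_avar(eps: float, h: float) -> float:
    """Asymptotic variance of sqrt(n) * (sample mean): the variance of F_{eps,h}."""
    return (1.0 - eps) + eps * h * h

def median_avar(eps: float, h: float) -> float:
    """Asymptotic variance of sqrt(n) * (sample median): 1 / (4 f(0)^2)."""
    # Density of the mixture at its centre of symmetry.
    f0 = ((1.0 - eps) + eps / h) / math.sqrt(2.0 * math.pi)
    return 1.0 / (4.0 * f0 * f0)

if __name__ == "__main__":
    # Compare the two limiting estimates for a few amounts of contamination.
    for eps in (0.0, 0.1, 0.5):
        print(eps, mean_avar(eps, 3.0), median_avar(eps, 3.0))
```

Under these assumptions, with no contamination the mean has asymptotic variance 1 and the median $\pi/2$, while for $\varepsilon = \tfrac{1}{2}$ and $h = 3$ the mean has asymptotic variance 5 and the median $9\pi/8$, so neither estimate is uniformly better.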


IV.2 Estimates for the One-Sample Location Problem. 
In this section we shall introduce two classes of estimates, 
in which the estimates are intermediaries between sample mean and 


sample median (in the above sense), and which are based on minimal 


principle.

$\sum_{i=1}^{n} |x_i - T^n| = \text{minimum} \iff T^n = \text{sample median}$
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and 
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be two classes of estimates corresponding to functions p, (t3A) and 
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(i) A.M) = 0 


(ii) $\lambda_j(\xi)$ is differentiable at $\xi = \mu$ and $\lambda_j'(\mu) < 0$
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The integrability 
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This shows that | A (é) is differentiable at &— =u. Since 


A, = Mi) < 0, we have that requirement (ii) of Lemma 4 is satisfied 
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tf g,(t,eydt + fg (t,£)dt} 
Oo Oo 
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similarly. 

(c) The functions $g_i(t,\xi)$, $i = 5,6,7,8$, are continuous functions
in $\xi$ at $\xi = \mu$ for all $t \in (0,\infty)$, which follows directly from the
expressions for the functions $g_i(t,\xi)$, $i = 5,6,7,8$.

(d) The absolute values of the functions $g_i(t,\xi)$, $i = 5,6,7,8$,
are bounded by a function G,(t) defined by: 
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This shows that Le) is continuous at & = uy, and I, (h) 
is finite for all $0 \le \varepsilon \le \tfrac{1}{2}$, $0 \le \lambda \le 1$, $1 \le h \le h^* < \infty$ with $h^*$
fixed, hence requirement (iii) of Lemma 4 is satisfied for $j =$
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ey) (in2F., Go)) + 2v[u-§] 


which is a differentiable function in $\xi$, and hence $\lambda_2(\xi)$ is
differentiable at $\xi = \mu$. Further

IV.4 Minimax Solution
Now that the requirements of Lemma 4 are satisfied for
THEOREM 1: There exists a unique most robust estimate $T^n_{\lambda^*}$ in the
class $\mathcal{E}_1$ and a unique most robust estimate $T^n_{\nu^*}$ in the class $\mathcal{E}_2$,
corresponding to the choice of parameters $\lambda = \lambda^*(\varepsilon) \in [0,1]$ and
$\nu = \nu^*(\varepsilon) \in [0,1]$, which minimize the suprema of the asymptotic
variances $V[\rho_1(t;\lambda), F_{\varepsilon,h}(t-\mu)]$ and $V[\rho_2(t;\nu), F_{\varepsilon,h}(t-\mu)]$ respectively,
the suprema being taken over the set

(4.3)  $\mathcal{F} = \{F_{\varepsilon,h}(t-\mu) : F_{\varepsilon,h}(t-\mu) = (1-\varepsilon)\,\Phi(t-\mu) + \varepsilon\,\Phi((t-\mu)/h),\ 1 \le h \le h^* < \infty\}$

of all possible underlying distributions for the one-sample problem
of estimating a single location parameter $\mu$ $(-\infty < \mu < \infty)$, where
$\Phi$ is the standard normal distribution function and $0 \le \varepsilon \le \tfrac{1}{2}$
and $h^*$ are fixed.


PROOF: It will be convenient to distinguish two cases: case 1,
$\varepsilon \in (0,\tfrac{1}{2}]$, and case 2, $\varepsilon = 0$.

Case 1. Suppose $\varepsilon \in (0,\tfrac{1}{2}]$. It is then seen that both
asymptotic variances (4.1) and (4.2) are increasing functions of
the scale ratio $h$, since the numerator and the denominator of the
expression for the asymptotic variances are non-decreasing and
non-increasing functions of $h$ respectively for $0 \le \lambda, \nu \le 1$ and
$0 < \varepsilon \le \tfrac{1}{2}$ fixed. Since the distributions $F_{\varepsilon,h}(t-\mu)$ range over
the set $\mathcal{F}$ defined by (4.3), the suprema of the asymptotic variances
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Since our measure of robustness is the inverse of the 
supremum of the asymptotic variance over the set F of all possible 
underlying distributions, the most robust estimates in the classes 
$\mathcal{E}_1$ and $\mathcal{E}_2$ will correspond to the choices of parameters
$\lambda = \lambda^*(\varepsilon) \in [0,1]$ and $\nu = \nu^*(\varepsilon) \in [0,1]$ which will minimize
the suprema of asymptotic variances.

Let us denote by $W_2(\nu)$ the supremum of the asymptotic
variance (4.2) over the set $\mathcal{F}$, i.e.,
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It is easy to see that 
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(4BC-1)+(4B-4AC) aoe eR ace a 


if $(4BC-1) < 0$, which is equivalent to the condition $1 \le h^* < h_2(\varepsilon)$,
where $h_2(\varepsilon)$ $(> 1)$ is given by
Let us denote by $W_1(\lambda)$ the supremum of the asymptotic
variance (4.1) over the set $\mathcal{F}$, i.e.,
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from which by using the fact (see for example [15] hep. <250)ethat 
Thus we have that the function $W_1(\lambda)$ has an absolute
minimum on $[0,1]$ at $\lambda = \lambda^*(\varepsilon) = 0$ if


Vv 
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A ow t 
which is equivalent to the condition $h^* \ge h_1(\varepsilon)$, and the function

which is equivalent to the condition $1 \le h^* < h_1(\varepsilon)$, where
$h_1(\varepsilon)$ $(> 1)$ is given as a solution of
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= log 4 for € € (0,5) 


uf 
[1l-e(1 - h, Ce) 


Case 2: Suppose $\varepsilon = 0$. Then the class of underlying distributions
$\mathcal{F}$ reduces to a single member $F_{\varepsilon=0,h}(t-\mu) = \Phi(t-\mu)$. The
asymptotic variances (4.1) and (4.2) are then of the following forms:
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It is easy to see that the asymptotic variances are minimized in 


both instances for $\lambda = \nu = 1$. This proves the theorem.
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Remark: From the above theorem we have 
In the case when no contamination is present, i.e., when
$\varepsilon = 0$, the most robust estimates for the location parameter $\mu$ in
the classes $\mathcal{E}_1$ and $\mathcal{E}_2$ correspond to
$T^n_{\lambda=1} = T^n_{\nu=1} = \frac{1}{n}\sum_{i=1}^{n} x_i$, which is the
sample mean.
In the case when $\varepsilon \in (0,\tfrac{1}{2}]$, the most robust estimate
for the location parameter $\mu$ in the class $\mathcal{E}_1$ corresponds to
$T^n_{\lambda=0}$, i.e., to the sample median, if the asymptotically least favourable
distribution in the class $\mathcal{F}$ has scale ratio $h^* \ge h_1(\varepsilon)$.
This implies that the most robust estimate in $\mathcal{E}_1$ cannot correspond
to the sample median if the scale ratio

$1 \le h^* < \inf_{0 < \varepsilon \le 1/2} h_1(\varepsilon) \approx 6.6$
In the case when $\varepsilon \in (0,\tfrac{1}{2}]$, the most robust estimate
for the location parameter $\mu$ in the class $\mathcal{E}_2$ corresponds to
$T^n_{\nu=0}$, i.e., to the sample median, if the asymptotically least favourable
distribution in the class $\mathcal{F}$ has scale ratio $h^* \ge h_2(\varepsilon)$.
This implies that the most robust estimate in $\mathcal{E}_2$ cannot correspond
to the sample median if the scale ratio

$1 \le h^* < \inf_{0 < \varepsilon \le 1/2} h_2(\varepsilon) = 4.0$
In the case when $\varepsilon \in (0,\tfrac{1}{2}]$, and when the scale ratio
$h^*$ of the asymptotically least favourable distribution in $\mathcal{F}$ lies
in the range $1 \le h^* < h_1(\varepsilon)$, the most robust estimate for the location
parameter $\mu$ in $\mathcal{E}_1$ corresponds to the choice of parameter $\lambda = \lambda^*(\varepsilon)$ in
the interval $(0,1)$.
In the case when $\varepsilon \in (0,\tfrac{1}{2}]$, and when the scale ratio
$h^*$ of the asymptotically least favourable distribution in $\mathcal{F}$ lies
in the range $1 \le h^* < h_2(\varepsilon)$, the most robust estimate for the location
parameter $\mu$ in $\mathcal{E}_2$ corresponds to the choice of parameter
$\nu = \nu^*(\varepsilon)$ in the interval $(0,1)$.

Thus if the scale ratio $h^*$ of the asymptotically least
favourable distribution in $\mathcal{F}$ satisfies

$1 \le h^* < \inf_{0 < \varepsilon \le 1/2} h_2(\varepsilon) = 4.0,$

then the most robust estimate in any of the classes $\mathcal{E}_1$ or $\mathcal{E}_2$
cannot correspond to the sample median.
Suppose now $h^* = 3$. Then the class $\mathcal{F}$ of underlying
distributions for the one-sample problem of estimating a location
parameter $\mu$ reduces to

(4.11)  $\mathcal{F} = \{F_{\varepsilon,h}(t-\mu) : F_{\varepsilon,h}(t-\mu) = (1-\varepsilon)\,\Phi(t-\mu) + \varepsilon\,\Phi((t-\mu)/h),\ 1 \le h \le 3\}$

The suprema of asymptotic variances of the estimates $T^n_\lambda \in \mathcal{E}_1$ and
$T^n_\nu \in \mathcal{E}_2$ are attained in both instances for an asymptotically least
favourable distribution function in $\mathcal{F}$, corresponding to the scale
ratio $h = h^* = 3$, i.e., when

$F_{\varepsilon,h^*=3}(t-\mu) = (1-\varepsilon)\,\Phi(t-\mu) + \varepsilon\,\Phi\left(\frac{t-\mu}{3}\right)$

is Tukey's contaminated normal distribution.


The most robust estimate in $\mathcal{E}_2$ corresponds to the choice
of parameter $\nu = \nu^*(\varepsilon)$, given by

(4.12)  $\nu^*(\varepsilon) = \dfrac{3(\pi-2) - 8\varepsilon(1-\varepsilon)}{3(\pi-2) + 8\varepsilon(1-\varepsilon)\,[4\sqrt{2\pi} - 1]}$

for $\varepsilon \in [0,\tfrac{1}{2}]$, and the most robust estimate in $\mathcal{E}_1$ corresponds
to the choice of parameter $\lambda = \lambda^*(\varepsilon)$ given as a solution of

(4.13)  $\dfrac{\partial}{\partial\lambda}\{V[\rho_1(t;\lambda), F_{\varepsilon,h^*=3}(t-\mu)]\} = 0,$

for $\varepsilon \in [0,\tfrac{1}{2}]$, from which no easy explicit expression for $\lambda^*(\varepsilon)$
can be obtained.

From expression (4.12) it can be seen that for the amount
of contamination $\varepsilon = 0$, the best choice of $\nu$ is $\nu = \nu^*(\varepsilon = 0) = 1$,
i.e., the most robust estimate for the location parameter $\mu$
corresponds to the sample mean $T^n_{\nu=1}$. For the maximal amount of
contamination $\varepsilon = \tfrac{1}{2}$ the best choice of $\nu$ is $\nu = \nu^*(\varepsilon = \tfrac{1}{2}) = 0.0663$,
i.e., the most robust estimate for the location parameter $\mu$
corresponds to the estimate $T^n_{\nu=0.0663}$.
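
The two quoted values of $\nu^*$ can be checked numerically. The following is a minimal Python sketch, not part of the original text. It assumes that the estimate $T^n_\nu$ minimizes $(1-\nu)\sum|x_i-c| + \nu\sum(x_i-c)^2$, so that $\psi(t) = (1-\nu)\,\mathrm{sgn}(t) + 2\nu t$, and that the asymptotic variance has the form $\int\psi^2\,dF / [\lambda'(\mu)]^2$ given in Lemma 4 of Chapter III. The closed form in `nu_star_412` is expression (4.12) as read here for $h^* = 3$, and all function names are illustrative.

```python
import math

def avar_nu(nu: float, eps: float, h: float) -> float:
    """Asymptotic variance of T_nu under F_{eps,h}, psi(t) = (1-nu)*sgn(t) + 2*nu*t."""
    a = (1.0 - eps) + eps * h * h                              # E t^2
    b = math.sqrt(2.0 / math.pi) * ((1.0 - eps) + eps * h)     # E |t|
    f0 = ((1.0 - eps) + eps / h) / math.sqrt(2.0 * math.pi)    # density at 0
    num = (1.0 - nu) ** 2 + 4.0 * nu * (1.0 - nu) * b + 4.0 * nu * nu * a  # E psi^2
    den = 2.0 * (1.0 - nu) * f0 + 2.0 * nu                     # -lambda'(mu)
    return num / (den * den)

def nu_star_grid(eps: float, h: float = 3.0, steps: int = 10000) -> float:
    """Minimiser of avar_nu over an evenly spaced grid of nu in [0, 1]."""
    grid = [k / steps for k in range(steps + 1)]
    return min(grid, key=lambda nu: avar_nu(nu, eps, h))

def nu_star_412(eps: float) -> float:
    """Expression (4.12) for h* = 3 (assumed reading of the formula)."""
    u = 8.0 * eps * (1.0 - eps)
    base = 3.0 * (math.pi - 2.0)
    return (base - u) / (base + u * (4.0 * math.sqrt(2.0 * math.pi) - 1.0))

if __name__ == "__main__":
    for eps in (0.0, 0.1, 0.5):
        print(eps, nu_star_grid(eps), nu_star_412(eps))
```

Under these assumptions the grid minimizer and the closed form agree to within the grid spacing, and both reproduce $\nu^* = 1$ at $\varepsilon = 0$ and $\nu^* \approx 0.0663$ at $\varepsilon = \tfrac{1}{2}$.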

The following two tables show the suprema of the asymptotic
variances of the estimates $T^n_\lambda$ and $T^n_\nu$ - the suprema being taken
over the set $\mathcal{F}$ defined by (4.11) - for selected values of $\lambda$, $\nu$
and amounts of contamination $\varepsilon$, i.e., the asymptotic variances
correspond to the asymptotically least favourable distribution in
the set $\mathcal{F}$, which is Tukey's contaminated normal distribution
with scale ratio $h^* = 3$. The values with asterisks correspond
to the minima of asymptotic variances for different amounts of
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TABLE 1 


The Supremum Over the Set $\mathcal{F}$ of the Asymptotic Variance
of the Estimate $T^n_\lambda$ Defined as a Solution of
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TABLE 2 


The Supremum Over the Set $\mathcal{F}$ of the Asymptotic Variance of
the Estimate $T^n_\nu$ Defined as a Solution of
From Tables 1 and 2, $T^n_{\lambda^*(\varepsilon)}$ seems to be preferable to
$T^n_{\nu^*(\varepsilon)}$ for all $\varepsilon \in [0,\tfrac{1}{2}]$.

From a practical point of view it is unpleasant that the
most robust estimate depends on the amount of contamination $\varepsilon$,
but it is an unavoidable difficulty. If we assume that contamination
can occur in the whole range $\varepsilon \in [0,\tfrac{1}{2}]$, then from Tables 1 and 2
the estimates $T^n_{\lambda=\ldots}$ and $T^n_{\nu=\ldots}$ seem to be reasonable choices.
If we assume that contamination can occur in the smaller range
$\varepsilon \in [0,\tfrac{1}{10}]$, then from Tables 1 and 2 the estimates
$T^n_{\lambda=\ldots}$ and $T^n_{\nu=\ldots}$ seem to be reasonable choices.

The averages of the functions $\nu^*(\varepsilon)$ and $\lambda^*(\varepsilon)$ over the
intervals $[0,\tfrac{1}{2}]$ and $[0,\tfrac{1}{10}]$ yield the following results, where
the values for $\lambda^*(\varepsilon)$ were obtained by a graphical method:

$\ldots \int \nu^*(\varepsilon)\,d\varepsilon = 0.4999; \qquad \ldots \int \lambda^*(\varepsilon)\,d\varepsilon = \ldots$

$\ldots$ could also be reasonable candidates if the occurrence of contamination $\varepsilon$ $\ldots$
the ranges $[0,\tfrac{1}{2}]$ and $[0,\tfrac{1}{10}]$ respectively.
Now we shall turn to computational aspects of the estimates
$T^n_\lambda$ and $T^n_\nu$.

First we consider the estimate $T^n_\nu$.


LEMMA 1: Let $x = (x_1,\ldots,x_n)$ be an arbitrary sample point and
let $T^n_\nu$ be the estimate defined for $0 \le \nu \le 1$ by:

$Q_\nu(c) = (1-\nu)\sum_{i=1}^{n} |x_i - c| + \nu\sum_{i=1}^{n} (x_i - c)^2 = \min \iff c = T^n_\nu$

Then for each $0 \le \nu \le 1$ we have

$\min(\tilde{x},\bar{x}) \le T^n_\nu \le \max(\tilde{x},\bar{x}),$

where $\tilde{x}$ and $\bar{x}$ are the sample median and the sample mean respectively.


PROOF: It is seen that $Q_\nu(c)$ is for each fixed $\nu \in [0,1]$ a
convex function in $c$,

(4.14)  $Q_0(c) = \sum_{i=1}^{n} |x_i - c| = \min \iff c = T^n_{\nu=0} = \tilde{x}$
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" 2 
: n ws 
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n n 
Q (a) = Ne and OAS < Q(%) 
Thus in order to show that $T^n_\nu$ lies between $T^n_{\nu=0}$ and $T^n_{\nu=1}$ for each $0 \le \nu \le 1$
it is enough to show that

$Q_\nu(a-\Delta) \ge Q_\nu(a)$ and $Q_\nu(b) \le Q_\nu(b+\Delta)$

for each $\Delta > 0$. We can write $Q_\nu(c)$ in the form

$Q_\nu(c) = (1-\nu)\sum_{i=1}^{n} |x_i - c| + \nu\sum_{i=1}^{n} (x_i - c)^2 = (1-\nu)\,Q_0(c) + \nu\,Q_1(c)$

It is convenient to distinguish two situations. First let $a = \tilde{x}$
and $b = \bar{x}$. Then from the convexity of $Q_\nu(c)$, $Q_0(c)$ and $Q_1(c)$,
and from the fact that $\tilde{x}-\Delta < \tilde{x} \le \bar{x} < \bar{x}+\Delta$ we have

$Q_\nu(a) = Q_\nu(\tilde{x}) = (1-\nu)\,Q_0(\tilde{x}) + \nu\,Q_1(\tilde{x}) \le (1-\nu)\,Q_0(\tilde{x}-\Delta) + \nu\,Q_1(\tilde{x})$
$\le (1-\nu)\,Q_0(\tilde{x}-\Delta) + \nu\,Q_1(\tilde{x}-\Delta) = Q_\nu(\tilde{x}-\Delta) = Q_\nu(a-\Delta)$

and

$Q_\nu(b) = Q_\nu(\bar{x}) = (1-\nu)\,Q_0(\bar{x}) + \nu\,Q_1(\bar{x}) \le (1-\nu)\,Q_0(\bar{x}+\Delta) + \nu\,Q_1(\bar{x})$
$\le (1-\nu)\,Q_0(\bar{x}+\Delta) + \nu\,Q_1(\bar{x}+\Delta) = Q_\nu(\bar{x}+\Delta) = Q_\nu(b+\Delta)$

For the case $a = \bar{x}$ and $b = \tilde{x}$, by the completely analogous argument,

$Q_\nu(a) \le Q_\nu(a-\Delta)$

and

$Q_\nu(b) \le Q_\nu(b+\Delta)$

Thus we have $Q_\nu(a-\Delta) \ge Q_\nu(a)$ and $Q_\nu(b) \le Q_\nu(b+\Delta)$, which
terminates the proof.
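
The statement of Lemma 1 can be checked numerically on a concrete sample. The following is a minimal Python sketch, not part of the original text. It assumes the objective $Q_\nu(c) = (1-\nu)\sum|x_i-c| + \nu\sum(x_i-c)^2$ and minimizes it by a plain ternary search, which is a generic method for convex functions and not the procedure of this chapter. The function names are illustrative, and the sample is the one used in Example 1 of this section.

```python
from statistics import mean, median

def q_nu(c, xs, nu):
    """Objective Q_nu(c) = (1-nu) * sum|x_i - c| + nu * sum (x_i - c)^2."""
    return (1.0 - nu) * sum(abs(x - c) for x in xs) + nu * sum((x - c) ** 2 for x in xs)

def t_nu_numeric(xs, nu, iters=200):
    """Minimise the convex function Q_nu by ternary search on [min(xs), max(xs)]."""
    lo, hi = float(min(xs)), float(max(xs))
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if q_nu(m1, xs, nu) <= q_nu(m2, xs, nu):
            hi = m2          # a minimiser lies in [lo, m2]
        else:
            lo = m1          # a minimiser lies in [m1, hi]
    return 0.5 * (lo + hi)

sample = [2, 3, 4, 6, 7, 8, 33]
# Lemma 1 bounds: the smaller and the larger of sample median and sample mean.
lo_b, hi_b = sorted((median(sample), mean(sample)))
estimates = {nu: t_nu_numeric(sample, nu) for nu in (0.0, 0.05, 0.1, 0.5, 1.0)}

if __name__ == "__main__":
    for nu, t in estimates.items():
        print(nu, t, lo_b <= round(t, 6) <= hi_b)
```

For this sample the median is 6 and the mean is 9, and every computed estimate falls between the two, up to the numerical tolerance of the search.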


LEMMA 2: Let $x = (x_1,\ldots,x_n)$ be an arbitrary sample point, and
$y_1 \le y_2 \le \ldots \le y_n$ be the corresponding ordered sample point.
Consider intervals $I_i$ for $i = 0,1,\ldots,n$ where $I_i = (y_i, y_{i+1}]$ for
PROOF: From the definition of $T^n_\nu$ we have for $0 \le \nu \le 1$:

$Q_\nu(c) = (1-\nu)\sum_{j=1}^{n} |c - x_j| + \nu\sum_{j=1}^{n} (c - x_j)^2 = \min \iff c = T^n_\nu$

or equivalently for the ordered $x_j$'s,

$Q_\nu(c) = (1-\nu)\sum_{j=1}^{n} |c - y_j| + \nu\sum_{j=1}^{n} (c - y_j)^2 = \min \iff c = T^n_\nu$

Consider the function $Q_\nu(c)$ on an arbitrary interval $I_i$ for
The derivative of the function $Q_\nu(c)$ in the interior of the
interval $I_i$, the right derivative at the leftmost point and
the left derivative at the rightmost point are given by

The necessary and sufficient condition for the function $Q_\nu(c)$ to
have its minimum in the interval $I_i$, $i = 1,\ldots,n-1$,
is that the right derivative of the function $Q_\nu(c)$ on the
interval $I_i$ at the leftmost point is negative and the right
derivative of the function $Q_\nu(c)$ on the interval $I_{i+1}$ at the
leftmost point is non-negative.

The following two inequalities obtained from (4.20)
express the above fact.

From the differentiability of the function $Q_\nu(c)$ on the
interior of the intervals $I_i$, $i = 0,1,\ldots,n$, it then follows that
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Example 1: Let the ordered Sample values for n=7 be 


$(y_1, y_2, y_3, y_4, y_5, y_6, y_7) = (2,3,4,6,7,8,33)$

Then the sample mean is $\bar{x} = 9$ and the sample median is $\tilde{x} = 6$.

From Lemma 1 we know that for $\nu \in [0,1]$, $T^{n=7}_\nu \in [\tilde{x}, \bar{x}]$.
Let us find the value of the estimate $T^{n=7}_{\nu=0.1}$. For the interval
$(y_5, y_6] = (7,8]$ we have

$c^*(5) = \bar{x} + \dfrac{(n-2i)(1-\nu)}{2n\nu} = 7\tfrac{1}{14}; \qquad c^*(6) = 5\tfrac{11}{14}$

The conditions (4.16) and (4.17) are satisfied for the
interval $(y_5, y_6]$, since $7 < 7\tfrac{1}{14}$ and $8 > 5\tfrac{11}{14}$ hold. Since
also $c^*(5) = 7\tfrac{1}{14} \in (7,8]$, we have $T^{n=7}_{\nu=0.1} = 7\tfrac{1}{14}$.


Example 2: For determination of $T^{n=7}_{\nu=0.05}$ for the same sample, let
us consider the interval $(y_4, y_5] = (6,7]$. We have

$c^*(4) = 7\tfrac{9}{14}; \qquad c^*(5) = 4\tfrac{13}{14}$

The conditions (4.16) and (4.17) are satisfied for the interval
$(y_4, y_5]$, since $6 < 7\tfrac{9}{14}$ and $7 > 4\tfrac{13}{14}$. Since
$c^*(4) = 7\tfrac{9}{14} \notin (6,7]$, we have $T^{n=7}_{\nu=0.05} = 7 = y_5$.


Thus we showed that with the help of Lemmas 1 and 2 the
evaluation of the estimate $T^n_\nu$ is quite simple. Besides, if the
value of $\nu$ is fixed, then by evaluation of $c^*(i)$ for $i = 1,2,\ldots,n$
we can easily obtain the estimate $T^n_\nu$.
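
The interval scan used in Examples 1 and 2 can be sketched in code. The following is a minimal Python sketch, not part of the original text. It assumes that $c^*(i) = \bar{x} + (n-2i)(1-\nu)/(2n\nu)$, the stationary point of the quadratic piece of $Q_\nu$ on $(y_i, y_{i+1})$, which reproduces the value $c^*(5) = 7\tfrac{1}{14}$ of Example 1. It also assumes that conditions (4.16) and (4.17) read $y_i < c^*(i)$ and $y_{i+1} \ge c^*(i+1)$. The function name and the treatment of $\nu = 0$ are illustrative choices.

```python
def t_nu_exact(xs, nu):
    """Locate the minimiser of Q_nu(c) by scanning the intervals (y_i, y_{i+1}]."""
    ys = sorted(xs)
    n = len(ys)
    xbar = sum(ys) / n
    if nu == 0.0:
        # Pure median case; for even n the midpoint of the two middle
        # order statistics is returned (a convention chosen here).
        return 0.5 * (ys[(n - 1) // 2] + ys[n // 2])

    def c_star(i):
        # Stationary point of the quadratic piece of Q_nu on (y_i, y_{i+1}).
        return xbar + (n - 2 * i) * (1.0 - nu) / (2.0 * n * nu)

    for i in range(1, n):                  # 1-based interval index i = 1, ..., n-1
        y_i, y_next = ys[i - 1], ys[i]
        if y_i < c_star(i) and y_next >= c_star(i + 1):
            # Minimum lies in (y_i, y_{i+1}]: the stationary point if it falls
            # inside the interval, otherwise the right endpoint y_{i+1}.
            return c_star(i) if c_star(i) <= y_next else y_next
    return ys[0]                           # degenerate case, e.g. all values equal

if __name__ == "__main__":
    sample = [2, 3, 4, 6, 7, 8, 33]
    print(t_nu_exact(sample, 0.1), t_nu_exact(sample, 0.05))
```

For the sample of Example 1 this sketch returns $7\tfrac{1}{14}$ for $\nu = 0.1$ and 7 for $\nu = 0.05$, in agreement with the two worked examples.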

Let us turn now to the estimate $T^n_\lambda$. If $x = (x_1,\ldots,x_n)$
is any sample point and $y_1 \le y_2 \le \ldots \le y_n$ is the ordered sample
point, the estimate was defined by

(4.21)  $Q_\lambda(c) = \sum_{j=1}^{n} |x_j - c|^{\lambda}\,\mathrm{sgn}(x_j - c) = 0 \iff c = T^n_\lambda$

for $0 \le \lambda \le 1$,

or equivalently

(4.22)  $Q_\lambda(c) = \sum_{j=1}^{n} |y_j - c|^{\lambda}\,\mathrm{sgn}(y_j - c) = 0 \iff c = T^n_\lambda$

for $0 \le \lambda \le 1$.


To solve the equation (4.22) for some $0 < \lambda < 1$ there is no
simple method and apparently an iteration must be used.
The obvious bounds for the solution of (4.22) for any
$0 \le \lambda \le 1$ are the maximum and the minimum value of the sample.
That the analogue of Lemma 1 does not hold for the
estimate $T^n_\lambda$ can be seen by considering a sample point for which
the sample median $\tilde{x}$ and sample mean $\bar{x}$ are equal. If the
estimate $T^n_\lambda$ always lay between the sample mean and sample
median, it would then be equal to the value $\tilde{x} = \bar{x}$ for any
$0 \le \lambda \le 1$. But this is not the case in general.
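
One possible iteration for (4.21) is bisection between the sample minimum and maximum. The following is a minimal Python sketch, not part of the original text. It relies on the observation that each term $|x_j - c|^{\lambda}\,\mathrm{sgn}(x_j - c)$ is non-increasing in $c$ for $0 < \lambda \le 1$, so the left side of (4.21) changes sign exactly once on that range. The function names and the sample are illustrative. The sample is chosen so that mean and median coincide, which shows that $T^n_\lambda$ need not equal their common value.

```python
def q_lambda(c, xs, lam):
    """Left side of (4.21): sum of |x_j - c|^lam * sgn(x_j - c)."""
    total = 0.0
    for x in xs:
        d = x - c
        if d > 0:
            total += d ** lam
        elif d < 0:
            total -= (-d) ** lam
    return total

def t_lambda(xs, lam, iters=200):
    """Solve q_lambda(c) = 0 for 0 < lam <= 1 by bisection on [min(xs), max(xs)]."""
    lo, hi = float(min(xs)), float(max(xs))
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if q_lambda(mid, xs, lam) > 0:   # still left of the root
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    sample = [0, 4, 5, 5, 11]            # sample mean = sample median = 5
    print(t_lambda(sample, 1.0), t_lambda(sample, 0.5))
```

For this sample the solution for $\lambda = 1$ is the mean, 5, while the solution for $\lambda = 0.5$ lies strictly below 5, outside the degenerate interval between mean and median.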


From Tables 1 and 2 it can be seen that the estimate
$T^n_\lambda$ has generally lower asymptotic variances than the estimate $T^n_\nu$.
On the other hand the estimate $T^n_\nu$ is easier to compute and hence
for practical purposes the estimate $T^n_\nu$ would be more useful.


IV.5 Concluding Remarks 


For a simple model of indeterminacy, in which for the
special case $h^* = 3$ Tukey's prototype distribution function turned
out to be the asymptotically least favourable distribution, we produced
two classes of estimates $T^n_\lambda \in \mathcal{E}_1$ and $T^n_\nu \in \mathcal{E}_2$ as intermediaries
between sample mean and sample median.


We showed, using Huber's results, that $T^n_\lambda$ and $T^n_\nu$ are
asymptotically normal for $0 \le \lambda, \nu \le 1$. On the basis of the accepted
measure of robustness of an estimate - the inverse of the supremum
of the asymptotic variance of an estimate over the set of underlying
distributions - we determined the most robust estimates of
the unknown location parameter $\mu$ in the classes $\mathcal{E}_1$ and $\mathcal{E}_2$,
which correspond to the choices of $\lambda = \lambda^*(\varepsilon)$ and $\nu = \nu^*(\varepsilon)$.

On the whole $T^n_{\lambda^*(\varepsilon)}$ seems to be preferable to $T^n_{\nu^*(\varepsilon)}$ for all
$\varepsilon \in [0,\tfrac{1}{2}]$. On the other hand $T^n_{\lambda^*(\varepsilon)}$ is harder to compute than
$T^n_{\nu^*(\varepsilon)}$. Also $T^n_{\lambda^*(\varepsilon)}$ is both translation and scale invariant
whereas $T^n_{\nu^*(\varepsilon)}$ is only translation invariant. $T^n_{\lambda^*(\varepsilon)}$ does not
always lie between mean and median whereas $T^n_{\nu^*(\varepsilon)}$ does always
lie between mean and median. We recommend the use of the above
estimates whenever the contaminating distribution also is normal and
an upper limit can be set to the ratio of the variance of the
contaminating distribution to that of the contaminated distribution.
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